Extraction of Reward-Related Feature Space Using Correlation-Based and Reward-Based Learning Methods
Authors
Abstract
This article presents a novel learning paradigm that extracts a reward-related, low-dimensional state space by combining correlation-based learning, such as Input Correlation Learning (ICO learning), with reward-based learning, such as Reinforcement Learning (RL). Because ICO learning can quickly find a correlation between a state and an unwanted condition (e.g., failure), we use it to extract a low-dimensional feature space in which a failure-avoidance policy can be found. The extracted feature space is then used as a prior for RL. If a proper feature space can be extracted for a given task, the policy model can be simple and the policy can be improved easily. The performance of this learning paradigm is evaluated in a simulation of a cart-pole system. The results show that the proposed method enhances the feature extraction process and finds a proper feature space for a pole-balancing policy. That is, it allows a policy to stabilize the pole over a larger domain of initial conditions than either ICO learning alone or RL alone without any prior knowledge.
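The abstract gives no equations, so the following is only a minimal sketch of the correlation step, assuming the commonly cited discrete form of the ICO rule, in which each predictive weight changes in proportion to its input times the change in the reflex (failure) signal. The names `ico_step` and `run_demo`, the toy pulse signals, and the learning rate are hypothetical and are not taken from the article. The input filtering usually applied in ICO learning, the feature extraction, and the RL stage are all omitted.

```python
import numpy as np

def ico_step(weights, predictive, reflex_now, reflex_prev, mu):
    """One discrete ICO update: w_j += mu * u_j * (u_0[t] - u_0[t-1]).

    Assumption: raw, unfiltered signals are used for simplicity.
    """
    predictive = np.asarray(predictive, dtype=float)
    return weights + mu * predictive * (reflex_now - reflex_prev)

def run_demo(steps=200, mu=0.05):
    """Toy signals (hypothetical): input 0 overlaps each reflex onset, input 1 never does."""
    weights = np.zeros(2)
    reflex_prev = 0.0
    for t in range(steps):
        phase = t % 20
        # Reflex (failure) signal: a pulse during phases 10..12 of every 20-step cycle.
        reflex_now = 1.0 if 10 <= phase <= 12 else 0.0
        predictive = [
            1.0 if 8 <= phase <= 10 else 0.0,  # starts before the reflex and overlaps its onset
            1.0 if 0 <= phase <= 2 else 0.0,   # active only while the reflex is constant
        ]
        weights = ico_step(weights, predictive, reflex_now, reflex_prev, mu)
        reflex_prev = reflex_now
    return weights

if __name__ == "__main__":
    print(run_demo())
```

In this toy setting only the weight of the input that is active when the reflex signal rises grows, while the other weight stays at zero. This is the sense in which a correlation rule can flag state components that predict failure.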
Similar resources
Improving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA
With the explosive growth in the amount of information, tools and methods are needed to search, filter, and manage resources. One of the major problems in text classification relates to high-dimensional feature spaces. Therefore, a main goal in text classification is to reduce the dimensionality of the feature space. There are many feature selection methods. However...
An Analysis of Feature Selection and Reward Function for Model-Based Reinforcement Learning
In this paper, we propose a series of correlation-based feature selection methods for dealing with high dimensionality in feature-rich environments for model-based Reinforcement Learning (RL). Real-world RL tasks usually involve high-dimensional feature spaces where standard RL methods often perform badly. Our proposed approach adopts correlation among state features as a selection criterion. The...
结合Tile Coding的平均奖赏学习算法 An Average Learning Algorithm with Tile Coding
Average reward reinforcement learning is an important undiscounted optimality framework. It tries to learn a policy by maximizing the long-term average reward, and it is more appropriate for cyclical tasks than the discounted framework. Researchers have presented various average reward methods with many experiments to verify their validity. However, most of the work was based on discrete state spa...
The Pattern of Structural Relationships of Relapse of Individuals with Substance Use Disorder based on Attentional Bias and Reward Sensitivity with the Mediating Role of Inhibition Control
Objective: The aim of this study was to investigate the pattern of structural relationships of relapse in individuals with substance use disorder based on attentional bias and reward sensitivity with the mediating role of inhibition control. Method: The present study was a descriptive-correlational study of the structural equation modeling type. The statistical population of this study included all withdrawi...
Compatible Reward Inverse Reinforcement Learning
PROBLEM • Inverse Reinforcement Learning (IRL) problem: recover a reward function explaining a set of expert’s demonstrations. • Advantages of IRL over Behavioral Cloning (BC): – Transferability of the reward. • Issues with some IRL methods: – How to build the features for the reward function? – How to select a reward function among all the optimal ones? – What if no access to the environment? ...